ABSTRACT

The present study examined the sensitivity of several candidate metrics of real-time
workload within the spatial component of an unmanned aerial vehicle (UAV) task.
Advanced Brain Monitoring's (ABM) wireless B-Alert system was used to collect
participants' EEG workload and engagement data. Eye tracking data were also
collected. The UAV simulation required participants to report heading information
of moving vehicles, as seen from the UAV. There were four blocks of difficulty,
over which a significant performance decrement was shown. Additionally,
participants rated their workload significantly higher and pupil diameter
significantly increased across blocks of increasing difficulty, as well as within each
block during periods of highest mental demand. ABM's workload and engagement
metrics, however, did not show a significant change over or within blocks. The
results suggest that pupil diameter shows promise as a correlate of mental
workload.
INTRODUCTION


Augmented Cognition emphasizes the use of a closed-loop system using real-time
physiological assessment to improve human performance (Schmorrow & Stanney,
2008). In a training environment, closed-loop systems could reduce the time
required to train an individual by keeping workload at an optimal level for learning
(Coyne, Baldwin, Cole, Sibley, & Roberts, 2009). Several metrics, such as pupil
diameter and electroencephalographic (EEG) activity, have been shown to vary
predictably with increases and decreases in workload. Monitoring these different
metrics allows training to be optimized in computer-based training (CBT) environments.
Ultimately, this research will impact the way CBT is conducted by establishing the
foundation for adaptive automation through monitoring neural resources.

EEG and eye tracking metrics have been extensively investigated as a means of
assessing cognitive workload. For example, Berka et al. (2007) developed a mental
workload metric based on an individual's EEG signal that tracks task demand in
mental arithmetic and digit span tasks. Other researchers have focused on eye
tracking metrics and found changes in pupil diameter, fixation duration, and blink
frequency to be predictive of various levels of cognitive demand in a task (Tsai,
Viirre, Strychacz, Chase, & Jung, 2007; Van Orden, Limbert, Makeig, & Jung,
2001; Veltman & Gaillard, 1996). Additionally, many researchers have had success
using artificial neural networks (ANN) to accurately classify different operator
states for individuals (Wilson, 2005; Wilson & Russell, 2003) and improve
performance with the aid of adaptive automation (Wilson & Russell, 2007).

Recent advances in eye tracking and EEG technologies have made utilizing
closed-loop systems based on physiological measures more feasible. For example,
accurate and unobtrusive off-the-head eye trackers now allow and account for head
movements and can collect and process data in real time. Furthermore, technologies
like wireless EEG caps and dry, no-prep electrodes have recently been developed
(Christensen, Estepp, Wilson, & Davis, 2009), both of which reduce the prep time
normally required. Both EEG and eye tracking data can also now be collected and
analyzed using affordable personal computers that are capable of processing and
storing large amounts of data. These and other similar advances have made it viable to
utilize this type of technology in a CBT environment.

The ultimate goal of this multi-year effort is to build an automated training
environment where objective physiological metrics, along with subjective workload
ratings and quantifiable performance measures, can be used to classify an
individual's workload and guide desktop training simulations. The purpose of the
current study, reported here, was to examine neurophysiological markers of
workload in a simulated UAV task at varying levels of difficulty.


METHOD 


PARTICIPANTS 

All participants (N = 15) were volunteers recruited from the Naval Research
Laboratory. None of the participants had any prior experience with UAV
simulators. Two were dropped from the study: one because of second-day attrition
and the other because of partially dropped eye tracking data. Therefore, thirteen
participants' eye tracking and performance data were analyzed. Only the last
nine participants' electroencephalographic (EEG) data were analyzed, because a hard
drive error caused four participants' data to be lost.

MATERIALS 

Advanced Brain Monitoring's (ABM) wireless B-Alert system was used to collect
participants' EEG data. The system uses a wireless six-channel head cap that
transmits data via Bluetooth to a PC running ABM's B-Alert software. ABM's
classification algorithms assessed raw EEG and provided second-by-second
workload and engagement metrics on a scale of 0 to 1. In addition, the Tobii X120
off-the-head eye tracker was used to collect pupil diameter and gaze position data. The
unit was placed in front of the participant and just below the surface of the monitor
running the simulation. The system recorded both eyes at 120 samples per second.

Virtual  Battlespace  2  (VBS2)  by  Bohemia  Interactive,  Australia  was  used  to 
construct  the  UAV  simulation  scenarios.  VBS2  is  a  high-fidelity,  3-D  virtual 
training  system  used  for  experimental  and  military  training  exercises.  One 
Windows  PC  ran  the  UAV  scenario,  while  a  second  PC  recorded  the  eye  tracking 
data, and a third recorded the EEG data. All computers were time-synchronized using
the Network Time Protocol (NTP) to ensure accurate post-hoc data analysis.


TASKS  AND  PROCEDURES 


UAV  DESKTOP  SIMULATION 

After  receiving  a  brief  PowerPoint  training  about  the  task,  participants  engaged  in  a 
UAV  desktop  simulation  created  from  videos  using  VBS2  where  they  were  trained 
to  report  information  on  enemy  targets  as  seen  from  a  UAV.  A  continuous  video 
stream  from  the  UAV  was  shown  on  the  monitor  (Image  1)  and  participants  were 
asked  to  report  heading  information  about  the  target  vehicles  crossing  the  screen. 
Participants were given the heading of the UAV and were required to estimate the
heading of the vehicle on the ground. A graphical depiction of a compass facing
due  north  with  30  degree  increments  was  provided  to  the  participant  for  reference. 
After  entering  the  target  heading  estimation,  participants  were  then  asked  to  rate 
their  mental  effort  in  calculating  the  target  heading. 
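
The heading judgment described above can be illustrated with a short sketch. It rests on an assumption the text does not state: that the video is displayed heading-up, so the top of the screen corresponds to the UAV's own heading, and that the vehicle's apparent direction of travel is measured clockwise from the top of the screen. The function name, its parameters, and the snapping to 30 degree increments are hypothetical and serve only as illustration.

```python
def estimate_target_heading(uav_heading_deg, screen_direction_deg, increment_deg=30):
    """Return a ground vehicle's compass heading, snapped to the nearest increment.

    Assumption (not from the study): the display is heading-up, so a vehicle
    moving straight up the screen travels along the UAV's own heading.
    screen_direction_deg is measured clockwise from the top of the screen.
    """
    # Rotate the on-screen direction into the compass frame.
    raw = (uav_heading_deg + screen_direction_deg) % 360
    # Snap to the response grid (30 degree increments on the reference compass).
    snapped = round(raw / increment_deg) * increment_deg
    return snapped % 360


if __name__ == "__main__":
    # UAV heading 300, vehicle moving 30 degrees left of straight up (330 clockwise).
    print(estimate_target_heading(300, 330))
```

Under this assumption, the mental operation is a single modular addition, and difficulty rises when the UAV heading is no longer 0 because the rotation can no longer be skipped.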

The  difficulty  of  the  task  progressed  over  four  blocks  of  trials.  Only  one  vehicle 
was  shown  on  the  screen  at  a  time  and  a  total  of  sixteen  vehicles  were  shown  within 
each  block.  Difficulty  was  manipulated  by  varying  the  UAV  heading  as  well  as  the 
possible  target  heading.  For  example,  the  easiest  level  (block  one)  showed  the  UAV 
heading  at  only  0  degrees  and  the  target's  heading  could  be  either  0,  90, 180  or  270 
degrees.  The  most  difficult  level  (block  four)  showed  the  UAV  heading  at  various 
30  degree  increments,  which  changed  after  every  two  targets,  and  the  target  heading 
could  be  any  30  degree  increment. 
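
The two difficulty levels specified above can be sketched as a trial generator. This is a minimal illustration, not the study's procedure: the study presented a fixed sequence to all participants, whereas the sketch samples headings at random, and blocks two and three are left unsupported because the text does not describe them. The function name `make_block` and its parameters are hypothetical.

```python
import random


def make_block(block, n_targets=16, seed=None):
    """Generate (uav_heading, target_heading) pairs for block 1 or block 4.

    Random sampling is an assumption made for illustration; the study used
    a fixed order. Blocks two and three are not described in the text.
    """
    rng = random.Random(seed)
    if block == 1:
        # Easiest level: UAV always heads 0; targets on the four cardinal headings.
        uav_choices, target_choices, change_every = [0], [0, 90, 180, 270], n_targets
    elif block == 4:
        # Hardest level: any 30 degree increment; UAV heading changes every two targets.
        steps = list(range(0, 360, 30))
        uav_choices, target_choices, change_every = steps, steps, 2
    else:
        raise ValueError("only blocks 1 and 4 are described in the text")

    trials = []
    uav = None
    for i in range(n_targets):
        if i % change_every == 0:
            # Pick a new UAV heading that differs from the current one when possible.
            candidates = [h for h in uav_choices if h != uav] or uav_choices
            uav = rng.choice(candidates)
        trials.append((uav, rng.choice(target_choices)))
    return trials


if __name__ == "__main__":
    for trial in make_block(4, seed=1):
        print(trial)
```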

Since this simulation is ultimately intended to help train a UAV operator, the
order of difficulty levels was not randomized. On the first day, participants only
completed  one  block,  referred  to  as  the  baseline  block,  which  was  the  equivalent 
difficulty  level  of  block  four.  On  the  second  day  of  the  experiment,  participants 
progressed  through  the  task  from  block  one  to  block  four.  This  was  done  in  order  to 
assess  learning,  by  comparing  performance  on  the  baseline  block  and  block  four. 
Each  block  took  approximately  eight  minutes  to  complete. 


IMAGE 1. Screenshot of the UAV simulation. Note the dust trail of an enemy vehicle
just to the right of center. Based on the given UAV heading of 300°, the participant
would correctly report this vehicle heading as approximately 270°.


THE  EXPERIMENT 


All participants took part in two one-hour sessions over two days. At the beginning
of each day, participants were prepped for EEG recording with ABM's six-electrode
wireless headset. Both EEG and eye tracking data were collected while
participants were engaging with the UAV simulation.

On the first day, participants completed ABM's thirty-minute vigilance task.
This task was developed by ABM as a means to filter out noise and uniquely fit
classification algorithms to a participant in order to assess various levels of
cognitive state. The vigilance task and software are part of ABM's real-time EEG
classification system. After that task was completed, all subsequent EEG data were run
through ABM's classification algorithm to provide an individual's workload and
engagement in real time. After this process, participants reviewed a PowerPoint
presentation that contained an overview of the tasks and training on how to
complete the heading determination task. Participants were given a brief practice on
the task and then completed the experimental baseline block.

On the second day, participants were prepped for EEG and the experimenter
briefly reviewed the task instructions. Following the instructions, participants began
the UAV simulation while participant performance, EEG, and eye tracking data
were collected, along with subjective mental effort ratings. All participants
proceeded from blocks one through four with targets appearing at the exact same
time, in the same order.


RESULTS 


BEHAVIORAL  PERFORMANCE 

Analysis of performance data for blocks one through four confirmed effective
manipulation of difficulty among levels within the UAV simulation. A significant
difference existed among blocks one through four in heading error, F(3, 36) =
16.52, p < .001, η² = .75, subjective workload ratings, F(3, 36) = 43.47, p < .001,
η² = .78, and errors of omission, F(3, 36) = 4.50, p = .006, η² = .29. Heading error
was computed by dividing the error from the correct heading answer by 180 degrees;
subjective ratings were on a scale of one to seven; and errors of omission were
averaged over the entire block. See Figure 1 for a depiction of these effects.
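
The heading error measure can be sketched as follows. The text states only that the error from the correct heading was divided by 180 degrees; the sketch assumes that "error" means the smallest angular difference between the reported and correct headings, which is the reading under which 180 is the maximum possible error. The function name is hypothetical.

```python
def heading_error(reported_deg, correct_deg):
    """Smallest angular difference between two headings, divided by 180.

    Assumption: the error wraps around the compass, so 350 versus 10
    counts as 20 degrees rather than 340. Returns a value in [0, 1].
    """
    diff = abs(reported_deg - correct_deg) % 360
    if diff > 180:
        diff = 360 - diff
    return diff / 180.0


if __name__ == "__main__":
    print(heading_error(350, 10))   # 20 degrees of error, normalized
    print(heading_error(0, 180))    # worst case
```

On this scale, a mean heading error of 0.16 corresponds to roughly 29 degrees, or about one 30 degree increment on the reference compass.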

While  a  statistically  significant  difference  does  not  exist  between  heading  error 
on  the  baseline  block  (M  =  0.16,  SD  =  0.08)  and  block  four  (M  =  0.13,  SD  =  0.07),  the 
average  error  did  decrease  slightly  and  errors  of  omission  decreased  from  1.69  on 
the  baseline  block  to  0.92  on  block  four. 
FIGURE 1. Average heading error and subjective workload ratings across all blocks


ABM'S  WORKLOAD  AND  ENGAGEMENT  INDICES 

Preliminary  analysis  of  the  ABM  workload  and  engagement  metrics  showed  almost 
identical  levels  of  workload  and  engagement  when  the  metrics  were  averaged 
within  each  block  and  then  compared  across  block  levels.  Thus,  we  further 
investigated  the  metrics  by  averaging  each  classification  over  the  three  seconds 
preceding  participant  response  for  each  target  heading.  This  time  was  chosen 
because  it  should  correspond  with  when  the  participant  is  calculating  the  target 
heading, and thus is most cognitively loaded. Still, results revealed no significant
difference in ABM's workload metric across blocks one through four, F(3, 24)
= 1.62, p = .211. Similarly, no significant difference existed in ABM's engagement
metric across blocks one through four, F(3, 24) = 1.41, p = .265. See Figure 2.


ABM's  Cognitive  State  Indices  Across  Blocks 
FIGURE 2. ABM's engagement and workload indices averaged over the three seconds prior
to participants providing their heading response


PUPILLOMETRY 


Pupil dilation was investigated as a measure of mental workload; pupil diameter
was averaged within each entire block and compared across
difficulty levels. Analysis revealed significant differences in pupil diameter among
blocks one through four for the left eye, F(3, 36) = 6.9, p = .005, η² = .37, as well
as the right eye, F(3, 36) = 6.9, p = .008, η² = .37, as shown in Figure 3. Analysis
of the pupil diameter between the baseline and block four yielded results that
differed between the eyes. A significant difference between the baseline (M = 3.34,
SD = .46) and block four (M = 3.25, SD = .39) did exist for the left eye, F(1, 12) =
5.18, p = .042, η² = .30. However, no significant difference existed between the
baseline (M = 3.34, SD = .36) and block four (M = 3.31, SD = .40) for the right eye,
F(1, 12) = 0.17, p = .688, η² = .02.

Further investigation of pupil dilation also prompted averaging pupil size over
the seconds immediately preceding participant response for each target heading.
Windows of one, three, and ten seconds were investigated and all yielded similar
results. In particular, across blocks one through four, pupil diameter averaged over
the one second preceding the heading response was significantly larger than pupil
diameter averaged across the whole block for the left eye, F(1, 12) = 64.96,
p < .001, η² = .84, and the right eye, F(1, 12) = 88.11, p < .001, η² = .88. This
suggests that pupil dilation is sensitive to phasic changes in workload over a small
amount of time and supports pupil dilation as a highly promising correlate of
workload. See Figure 4 for a comparison of the different average time increments.
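
The pre-response averaging can be sketched as a windowed mean over a timestamped signal. This is an illustrative sketch under stated assumptions, not the analysis code used in the study: the function names are hypothetical, lost samples are assumed to be marked as None, and the window is taken as half-open, including its start and excluding the response time itself. The same function would apply to the 120 samples per second pupil data and to the second-by-second ABM indices.

```python
def window_mean(timestamps, values, response_time, window_s):
    """Mean of samples with response_time - window_s <= t < response_time.

    Samples whose value is None (for example, lost tracking) are skipped.
    Returns None when the window holds no valid samples.
    """
    start = response_time - window_s
    selected = [v for t, v in zip(timestamps, values)
                if v is not None and start <= t < response_time]
    if not selected:
        return None
    return sum(selected) / len(selected)


def pre_response_means(timestamps, values, response_times, window_s=1.0):
    """Windowed mean before each response; window_s of 1, 3, or 10 matches the text."""
    return [window_mean(timestamps, values, r, window_s) for r in response_times]


if __name__ == "__main__":
    # Hypothetical data: 10 s of samples at 120 Hz, with a larger pupil in the last second.
    ts = [i / 120 for i in range(1200)]
    pupil = [3.0 if t < 9.0 else 4.0 for t in ts]
    print(pre_response_means(ts, pupil, [10.0], window_s=1.0))
    print(pre_response_means(ts, pupil, [10.0], window_s=3.0))
```

Comparing each pre-response mean with the mean over the whole block gives the contrast reported above; longer windows dilute a brief dilation, which is one reason to examine several window lengths.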


Average  Pupil  Dilation  Difference  Scores 
FIGURE 3. Mean difference, across all participants, between pupil size in each block and
the participant's own average pupil size
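
The difference scores plotted in Figure 3 can be sketched as follows. The caption does not say how each participant's average pupil size was obtained; the sketch assumes it is the mean of that participant's block means. The function name and the example values are hypothetical and are not data from the study.

```python
def difference_scores(block_means):
    """Average within-participant deviation of each block from the participant's mean.

    block_means maps a participant id to a list of per-block mean pupil
    diameters (same number of blocks for everyone).
    Assumption: a participant's average is the mean of his or her block means.
    """
    n_blocks = len(next(iter(block_means.values())))
    totals = [0.0] * n_blocks
    for means in block_means.values():
        own_average = sum(means) / len(means)
        for i, m in enumerate(means):
            # Subtracting the participant's own average removes individual
            # differences in baseline pupil size.
            totals[i] += m - own_average
    return [t / len(block_means) for t in totals]


if __name__ == "__main__":
    # Hypothetical values for two participants over four blocks.
    example = {"p1": [3.0, 3.2, 3.4, 3.6], "p2": [4.0, 4.0, 4.2, 4.2]}
    print(difference_scores(example))
```

Centering on each participant's own average keeps large between-person differences in resting pupil size from masking the smaller within-person change across blocks.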


Pupil  Dilation  prior  to  Heading  Response  compared 
to  Average  over  entire  Block 
FIGURE 4. Pupil dilation averaged over the one second prior to heading response
compared to pupil dilation averaged over the entire block


DISCUSSION 


Analysis of the performance data and subjective workload ratings indicated that the
various levels of difficulty were successfully manipulated across the task.
Subjective workload ratings, errors of omission, and heading error all increased in
accordance with increasing levels of difficulty (i.e., from block one to block four).
While a significant difference does not exist between heading error for the baseline
block and block four, about half as many errors of omission occurred on block four
(i.e., a failure to respond due to time pressure or simply not knowing the answer).
Thus, heading errors on block four could be influenced by fewer omissions, and
therefore be slightly higher than if the omission rate were the same between block
four and the baseline block.

Comparison of performance between the baseline and block four is of interest
as a means of assessing the UAV simulation as a potential training simulation.
Potentially due to a lack of statistical power and other factors, pre-test (baseline
block) and post-test (block four) performance did not differ statistically. However,
the total time allotted to training was only about forty minutes over both days, since
each block took about eight minutes to complete. Hence, with more time to train an
individual at each level in a real-world training simulation, one would expect to see
smaller heading errors and fewer errors of omission by the end of training, compared
to the baseline trial. In addition, one would expect ratings of workload to be
significantly lower on the post-test than the pre-test.

Neither ABM's workload index nor its engagement index was sensitive to changes
in this task across difficulty levels. Changes were also not apparent when the index
was calculated three seconds prior to heading response, when workload and
engagement should have been highest within the block. On account of these
findings, future studies will not use ABM's cognitive state classification
algorithms, but instead will investigate the use of artificial neural networks as a
means of assessing workload in a UAV training simulation.

The most promising results of this study were systematic changes in pupil
dilation as a function of difficulty level. The initial analysis of pupil diameter was
performed by averaging an individual's pupil diameter over each eight-minute
block. Simply comparing average block pupil dilations yielded significant
differences in pupil size across blocks (see Figure 3). Further investigation showed
that average pupil size one second prior to submitting the heading response was
significantly larger than pupil size during the rest of the block (see Figure
4). This pre-response computation was also calculated at three and ten seconds
preceding response and yielded similar effects, indicating that this effect was likely
not due to some kind of response initiation. Therefore, pupil dilation is not only
sensitive to changes in workload over large periods of time, but is also sensitive
within the demands of a task. These results support the robustness of pupil
dilation as a means of assessing cognitive load.

One surprising result was the large difference between average left and right
pupil diameter for the baseline block, which yielded differing results when
comparing dilation between the baseline block and block four. Left eye data are
consistent with research that suggests differences in workload across difficulty
levels should diminish with practice (Berka et al., 2004). However, data from the
right eye would suggest that this is not the case. At present, further investigation is
necessary before any firm conclusions can be drawn.

Future studies are planned to investigate how measures of workload change with
practice within a difficulty level. In particular, other eye tracking metrics, such as
blink frequency/duration, fixation frequency/duration, and divergence will be
investigated. Analysis of blink data was not possible for this study, due to the
inability to reliably differentiate lost eye tracking data from blinks using the Tobii
eye tracking system. Future studies will use EOG to solve this problem. Fixation
data and nearest-neighbor analyses were also not possible because of
too much error in the Tobii calibration. This problem has been resolved with new
software that will be incorporated into future studies.

Another area of interest will be collecting physiological data when a participant
is overloaded. We intend to increase the difficulty level of the hardest block in
order to purposely overload the participant. Additionally, fewer blocks will be
necessary since it is difficult to distinguish four distinct levels in the performance
data. Three levels with more trials in each level will be used in follow-up studies.

Overall, these findings show promise for using pupil diameter as a means of
assessing workload. More data collection is necessary to investigate other eye
tracking and EEG correlates. Spectral analysis of the EEG recordings may
prove more sensitive than the ABM engagement index explored in the present
study. Ultimately, with the combination of performance, subjective ratings, eye
tracking data, and EEG, we are confident that we will be able to successfully
predict user workload and eventually perform mitigations within a closed-loop
system.